Statistical Proof

Statistical proof is the rational demonstration of the degree of certainty for a proposition, hypothesis, or theory, used to convince others following a statistical test of the supporting evidence and of the types of inferences that can be drawn from the test scores. Statistical methods are used to increase the understanding of the facts, and the proof demonstrates the validity and logic of inference with explicit reference to a hypothesis, the experimental data, the facts, the test, and the odds. Proof has two essential aims: the first is to convince and the second is to explain the proposition through peer and public review.

The burden of proof rests on the demonstrable application of the statistical method, the disclosure of the assumptions, and the relevance of the test to a genuine understanding of the data relative to the external world. There are adherents to several different statistical philosophies of inference, such as Bayes' theorem versus the likelihood function, or positivism versus critical rationalism. These methods of reasoning have a direct bearing on statistical proof and its interpretation in the broader philosophy of science.

A common demarcation between science and non-science is the hypothetico-deductive proof of falsification developed by Karl Popper, which is a well-established practice in the tradition of statistics. Other modes of inference, however, may include the inductive and abductive modes of proof. Scientists do not use statistical proof as a means to attain certainty, but to falsify claims and explain theory. Science cannot achieve absolute certainty, nor is it a continuous march toward an objective truth, as the vernacular (as opposed to the scientific) meaning of the term "proof" might imply. Statistical proof offers a kind of proof of a theory's falsity and the means to learn heuristically through repeated statistical trials and experimental error. Statistical proof also has applications in legal matters, with implications for the legal burden of proof.


Axioms

There are two kinds of axioms: 1) conventions that are taken as true, which should be avoided because they cannot be tested, and 2) hypotheses. Proof in the theory of probability was built on four axioms developed in the late 17th century:

1. The probability of a hypothesis is a non-negative real number: $\Pr(h) \geq 0$.
2. The probability of a necessary truth equals one: $\Pr(t) = 1$.
3. If two hypotheses $h_1$ and $h_2$ are mutually exclusive, then the sum of their probabilities is equal to the probability of their disjunction: $\Pr(h_1) + \Pr(h_2) = \Pr(h_1 \lor h_2)$.
4. The conditional probability of $h_1$ given $h_2$, $\Pr(h_1 \mid h_2)$, is equal to the unconditional probability $\Pr(h_1 \land h_2)$ of the conjunction of $h_1$ and $h_2$, divided by the unconditional probability $\Pr(h_2)$ of $h_2$ where that probability is positive: $\Pr(h_1 \mid h_2) = \frac{\Pr(h_1 \land h_2)}{\Pr(h_2)}$, where $\Pr(h_2) > 0$.

The preceding axioms provide the statistical proof and basis for the laws of randomness, or objective chance, from which modern statistical theory has advanced. Experimental data, however, can never prove that a hypothesis (h) is true; proof instead relies on an inductive inference that measures the probability of the hypothesis relative to the empirical data. The proof is in the rational demonstration of using the logic of inference, math, testing, and deductive reasoning of significance.
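The four axioms can be checked on a small finite example. The sketch below is only an illustration under stated assumptions: the sample space (one roll of a fair die), the hypotheses `h1`, `h2`, and `even`, and the helper names `pr` and `pr_given` are all hypothetical and do not come from the text above.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
OUTCOMES = frozenset(range(1, 7))

def pr(h):
    """Probability of a hypothesis h, modelled as a set of outcomes."""
    return Fraction(len(h & OUTCOMES), len(OUTCOMES))

def pr_given(h1, h2):
    """Conditional probability Pr(h1 | h2); defined only when Pr(h2) > 0."""
    if pr(h2) == 0:
        raise ValueError("Pr(h2) must be positive")
    return pr(h1 & h2) / pr(h2)

h1 = frozenset({1, 2})       # "the roll is 1 or 2"
h2 = frozenset({5, 6})       # "the roll is 5 or 6", mutually exclusive with h1
even = frozenset({2, 4, 6})  # "the roll is even"

axiom1 = all(pr(h) >= 0 for h in (h1, h2, even))  # non-negativity
axiom2 = pr(OUTCOMES) == 1                        # a necessary truth has probability one
axiom3 = pr(h1 | h2) == pr(h1) + pr(h2)           # additivity for exclusive hypotheses
conditional = pr_given(h1, even)                  # Pr(h1 and even) / Pr(even)

print(axiom1, axiom2, axiom3, conditional)
```

Exact fractions are used instead of floating-point numbers so that the equalities in axioms 2 and 3 can be compared without rounding error.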


Test and proof

The term ''proof'' descends from Latin roots (provable, probable, ''probare'' L.) meaning ''to test''. Hence, proof is a form of inference by means of a statistical test. Statistical tests are formulated on models that generate probability distributions. Examples of probability distributions include the binomial, normal, and Poisson distributions, which give exact descriptions of variables that behave according to natural laws of random chance. When a statistical test is applied to samples of a population, the test determines whether the sample statistics are significantly different from the assumed null model. True values of a population, which are unknowable in practice, are called parameters of the population. Researchers sample from populations to calculate statistics such as the mean or standard deviation, which provide estimates of the parameters. If the entire population is sampled, then the sample mean and distribution will converge with the parametric distribution.

Using the scientific method of falsification, a significance level is fixed before the test: it is the probability threshold below which a difference between the sample statistic and the null model is judged too large to be explained by chance alone. Most statisticians set this threshold at 0.05 or 0.1, which means that if a discrepancy at least as large as the one observed would be expected fewer than 5 (or 10) times out of 100 under the null model, then the discrepancy is unlikely to be explained by chance alone and the null hypothesis is rejected. Statistical models provide exact outcomes for the parameters and estimates for the sample statistics. Hence, the burden of proof rests in the sample statistics that provide estimates of a statistical model. Statistical models contain the mathematical proof of the parametric values and their probability distributions.
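The decision rule described in this section can be sketched with an exact binomial test. The example is an illustration only: the data (61 heads in 100 coin flips), the null model (a fair coin), the threshold `ALPHA`, and the function names are assumptions introduced here, and the two-sided p-value uses one common convention (summing every outcome that is no more probable than the observed one).

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_p_value(k, n, p_null):
    """Total probability, under the null model, of outcomes at least as
    improbable as the observed count k."""
    observed = binom_pmf(k, n, p_null)
    cutoff = observed * (1 + 1e-9)  # small tolerance for floating-point ties
    return sum(
        pmf
        for pmf in (binom_pmf(i, n, p_null) for i in range(n + 1))
        if pmf <= cutoff
    )

ALPHA = 0.05             # significance level, fixed before looking at the data
heads, flips = 61, 100   # hypothetical sample

p_value = two_sided_p_value(heads, flips, 0.5)
reject_null = p_value < ALPHA
print(f"p-value = {p_value:.4f}, reject null: {reject_null}")
```

Rejecting the null model here does not prove the alternative; it only says that a discrepancy this large would be unusual if the coin were fair.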


Bayes theorem

Bayesian statistics are based on a different philosophical approach to proof of inference. The mathematical formula for Bayes' theorem is:

$$\Pr[\text{Parameter} \mid \text{Data}] = \frac{\Pr[\text{Data} \mid \text{Parameter}] \times \Pr[\text{Parameter}]}{\Pr[\text{Data}]}$$

The formula is read as the probability of the parameter (or hypothesis ''h'', as used in the notation on axioms) "given" the data (or empirical observation), where the vertical bar refers to "given". The right-hand side of the formula combines the prior probability of a statistical model, $\Pr[\text{Parameter}]$, with the likelihood, $\Pr[\text{Data} \mid \text{Parameter}]$, to produce a posterior probability distribution of the parameter, $\Pr[\text{Parameter} \mid \text{Data}]$. The posterior probability is the probability that the parameter is correct given the observed data or sample statistics. Hypotheses can be compared using Bayesian inference by means of the Bayes factor, which is the ratio of the posterior odds to the prior odds. It measures whether the data have increased or decreased the odds of one hypothesis relative to another. The statistical proof is the Bayesian demonstration that one hypothesis has a higher (weak, strong, positive) likelihood.

There is considerable debate over whether the Bayesian method aligns with Karl Popper's method of proof of falsification, where some have suggested that "...there is no such thing as "accepting" hypotheses at all. All that one does in science is assign degrees of belief..." According to Popper, hypotheses that have withstood testing and have yet to be falsified are not verified but corroborated. Some researchers have suggested that Popper's quest to define corroboration on the premise of probability put his philosophy in line with the Bayesian approach. In this context, the likelihood of one hypothesis relative to another may be an index of corroboration, not confirmation, and thus statistically proven through rigorous objective standing.
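Bayes' theorem and the Bayes factor can be worked through for two competing point hypotheses. The following is a minimal sketch under assumed inputs: the hypotheses ("fair" coin with heads probability 0.5, "biased" coin with 0.7), the prior probabilities, the data (61 heads in 100 flips), and all function names are hypothetical and chosen only for illustration.

```python
from math import comb, isclose

def binom_likelihood(heads, flips, p):
    """Pr[Data | Parameter]: probability of the observed heads count given p."""
    return comb(flips, heads) * p**heads * (1 - p)**(flips - heads)

def posterior(priors, likelihoods):
    """Apply Bayes' theorem over a finite set of hypotheses."""
    # Pr[Data] is the prior-weighted sum of the likelihoods.
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}

heads, flips = 61, 100                   # hypothetical data
priors = {"fair": 0.8, "biased": 0.2}    # assumed prior degrees of belief
likelihoods = {
    "fair": binom_likelihood(heads, flips, 0.5),
    "biased": binom_likelihood(heads, flips, 0.7),
}

post = posterior(priors, likelihoods)

# Bayes factor: ratio of posterior odds to prior odds (biased versus fair).
prior_odds = priors["biased"] / priors["fair"]
posterior_odds = post["biased"] / post["fair"]
bayes_factor = posterior_odds / prior_odds

print(post, bayes_factor)
```

For point hypotheses such as these, the Bayes factor reduces to the ratio of the two likelihoods, so it does not depend on the priors chosen; the priors affect only the posterior probabilities.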


In legal proceedings

Statistical proof in a legal proceeding can be sorted into three categories of evidence:

1. The occurrence of an event, act, or type of conduct
2. The identity of the individual(s) responsible
3. The intent or psychological responsibility

Statistical proof was not regularly applied in decisions concerning United States legal proceedings until the mid-1970s, following a landmark jury discrimination case, ''Castaneda v. Partida''. The US Supreme Court ruled that gross statistical disparities constitute "''prima facie'' proof" of discrimination, resulting in a shift of the burden of proof from plaintiff to defendant. Since that ruling, statistical proof has been used in many other cases on inequality, discrimination, and DNA evidence. However, there is not a one-to-one correspondence between statistical proof and the legal burden of proof. "The Supreme Court has stated that the degrees of rigor required in the fact finding processes of law and science do not necessarily correspond."

In an example of a death row sentence (''McCleskey v. Kemp'') concerning racial discrimination, the petitioner, a black man named McCleskey, was charged with the murder of a white police officer during a robbery. Expert testimony for McCleskey introduced a statistical proof showing that "defendants charged with killing white victims were 4.3 times as likely to receive a death sentence as charged with killing blacks." Nonetheless, the statistics were insufficient "to prove that the decisionmakers in his case acted with discriminatory purpose." It was further argued that there were "inherent limitations of the statistical proof", because it did not refer to the specifics of the individual. Despite the statistical demonstration of an increased probability of discrimination, the legal burden of proof (it was argued) had to be examined on a case-by-case basis.


See also

* Mathematical proof
* Data analysis

